TREC 11 Experiments at NII: The Effects of Virtual Relevant Documents in Batch Filtering

Authors

  • Kyung-Soon Lee
  • Kyo Kageura
  • Akiko Aizawa
Abstract

Research on document retrieval, text categorization, and routing has shown the effects of learning from sampled relevant or non-relevant documents in the training set. Allan et al. (1995) used only the top K non-relevant documents to learn a routing query, where K equals the number of known relevant documents in the training set. This choice is motivated by the need to balance the numbers of relevant and non-relevant documents in Rocchio learning. Singhal et al. (1997) selectively used the non-relevant documents that belong to a query's domain to learn the feedback query. Kwok and Grunfeld (1997) used a genetic algorithm to select the best training subset of relevant documents for creating a feedback query. Most sampling techniques in machine learning aim at reducing the size of the training set.
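
The balanced sampling attributed to Allan et al. (1995) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the weights `alpha`, `beta`, and `gamma`, the use of a dot product with the initial query to rank non-relevant documents, and the clipping of negative term weights to zero are all assumptions made for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length term-weight vectors."""
    return sum(a * b for a, b in zip(u, v))


def centroid(vectors, dim):
    """Mean vector of a list of vectors; zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def balanced_rocchio(query, relevant, non_relevant,
                     alpha=1.0, beta=1.0, gamma=1.0):
    """Rocchio update using only the top-K non-relevant documents.

    K is set to the number of known relevant documents, so the positive
    and negative sets are the same size. Ranking by dot product with the
    initial query is an assumption, as are the default weights.
    """
    dim = len(query)
    k = len(relevant)
    # Keep the K non-relevant documents most similar to the initial query.
    top_neg = sorted(non_relevant,
                     key=lambda d: dot(query, d),
                     reverse=True)[:k]
    pos_c = centroid(relevant, dim)
    neg_c = centroid(top_neg, dim)
    # Assumed convention: negative term weights are clipped to zero.
    return [max(0.0, alpha * q + beta * p - gamma * n)
            for q, p, n in zip(query, pos_c, neg_c)]


if __name__ == "__main__":
    q = [1.0, 0.0, 0.0]
    rel = [[1.0, 1.0, 0.0]]
    non_rel = [[0.5, 0.0, 1.0], [0.0, 0.0, 1.0]]
    print(balanced_rocchio(q, rel, non_rel))
```

With one relevant document, only the single highest-ranked non-relevant document contributes to the negative centroid, which is the balance the abstract describes.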


Similar resources

The TREC 2002 Filtering Track Report

The TREC–11 filtering track measures the ability of systems to build persistent user profiles which successfully separate relevant and non-relevant documents in an incoming stream. It consists of three major subtasks: adaptive filtering, batch filtering, and routing. In adaptive filtering, the system begins with only a topic statement and a small number of positive examples, and must learn a be...


TREC-10 Experiments at KAIST: Batch Filtering and Question Answering

1. Introduction. In TREC-10, we participated in two tasks: the batch filtering task in the filtering track and the question answering task. In the question answering task, we participated in three sub-tasks (main task, list task, and context task). In the batch filtering task, we experimented with a filtering technique that unifies the results of support vector machines for subtopics subdivided by incremental cl...


TREC 11 Experiments at CAS-ICT: Filtering and Web

CAS-ICT took part in the TREC conference for the second time this year, undertaking two tracks of TREC-11. For the filtering track, we submitted results for all three subtasks. In adaptive filtering, we paid particular attention to the processing of undetermined documents, profile building, and adaptation. In batch filtering and routing, a centroid-based classifier is used with preprocessed samples. For ...


The TREC–9 Filtering Track Final Report

The TREC–9 filtering track measures the ability of systems to build persistent user profiles which successfully separate relevant and non-relevant documents. It consists of three major subtasks: adaptive filtering, batch filtering, and routing. In adaptive filtering, the system begins with only a topic statement and a small number of positive examples, and must learn a better profile from on-li...


The TREC-8 Filtering Track Final

The TREC-8 filtering track measures the ability of systems to build persistent user profiles which successfully separate relevant and non-relevant documents. It consists of three major subtasks: adaptive filtering, batch filtering, and routing. In adaptive filtering, the system begins with only a topic statement and must learn a better profile from on-line feedback. Batch filtering and routing are more tra...




Publication date: 2002